AHUMADA: a large speech corpus in Spanish for speaker identification and verification

Authors

  • Javier Ortega-Garcia
  • Joaquín González-Rodríguez
  • Victoria Marrero-Aguiar
  • Juan J. Díaz-Gómez
  • Ramon Garcia-Jimenez
  • Jose Juan Lucena-Molina
  • José A. G. Sanchez-Molero
Abstract

Speaker recognition is a major task when security applications through speech input are needed. Regarding speaker identity, several factors of variability must be considered: a) factors concerning peculiar intra-speaker variability (manner of speaking, inter-session variability, dialectal variations, emotional condition, etc.) or forced intra-speaker variability (Lombard effect, cocktail-party effect); b) factors depending on external influences (kind of microphone, channel effects, noise, reverberation, etc.). To cope with all these variability sources, a specific speech database called AHUMADA has been designed and collected for speaker recognition tasks in Castilian Spanish. AHUMADA incorporates six different recording sessions, including both in situ and telephone speech recordings. A total of 104 male speakers uttered isolated digits, digit strings, phonologically balanced short utterances, phonologically and syllabically balanced read text and more than one minute of spontaneous speech, so about 15 GB of speech material is available. Speaker verification results concerning the available variability sources are also presented.

Similar resources

Quantitative influence of speech variability factors for automatic speaker verification in forensic tasks

Regarding speaker identity in forensic conditions, several factors of variability must be taken into account, such as peculiar intra-speaker variability, forced intra-speaker variability or channel-dependent external influences. Using the large ‘AHUMADA’ speech database in Spanish, containing several recording sessions and channels, and including different tasks for 100 male speakers, automatic speaker ...

Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish

This paper presents and describes Ahumada III, a speech database in Spanish collected from real forensic cases. In its current release, the database presents male speakers recorded using the systems and procedures followed by the Spanish Guardia Civil police force. The paper also explores the usefulness of such a corpus for facing the important problem of database mismatch in speaker recognition, u...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Speaker Verification with Shifted Delta Cepstral Features: Its Pseudo-Prosodic Behavior

This paper examines the linear relation between Shifted Delta Cepstral (SDC) features and the dynamics of prosodic features. SDC features were reported to produce superior performance to ∆ features in language identification and speaker recognition systems. A selection of more correlated SDC features is used in speaker verification to evaluate its robustness to channel/handset mismatch. The expe...

Publication date: 1998